Preserve durable trace context in SQL history #306
Open
chandramouleswaran wants to merge 5 commits into microsoft:main
Keep orchestration replay span identity and sub-orchestration client span IDs when round-tripping SQL trace context so exported spans stitch correctly across continuations and nested sub-orchestrations. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
fanyirobin
reviewed
Apr 21, 2026
Contributor
Overall LGTM. The failed CI test might be flaky; I run into this failure often as well.
… test Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Contributor
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Preserves DurableTask trace/span identity when SQL persists history by extending the TraceContext encoding (without a schema change) and adding regression tests that validate stable span IDs and correct parent/child hierarchies across replays/continuations and sub-orchestrations.
Changes:
- Extend SQL `TraceContext` serialization/parsing to preserve `DistributedTraceContext.Id`/`SpanId` and sub-orchestration `ClientSpanId`.
- Rehydrate `ClientSpanId` when materializing `SubOrchestrationInstanceCreatedEvent`.
- Add unit + integration tests that lock down round-tripping and emitted Activity parent/child relationships.
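As a rough illustration of the extended serialization listed above, the sketch below packs the extra fields into a single string so that no schema change is needed. It is illustrative Python, not the provider's C# code. The function name `encode_trace_context`, the newline separator, and the field order are assumptions; only the `@tracestate=`, `@id=`, `@spanid=`, and `@clientspanid=` prefixes come from the PR description.

```python
from typing import Optional


def encode_trace_context(
    traceparent: Optional[str],
    tracestate: Optional[str] = None,
    id_: Optional[str] = None,
    span_id: Optional[str] = None,
    client_span_id: Optional[str] = None,
) -> str:
    """Hypothetical encoder: line 1 holds traceparent, later lines hold @key=value."""
    # Line 1 is always present, and is empty when there is no traceparent
    # (for example, a payload that only carries a client span ID).
    lines = [traceparent or ""]
    # Assumption: one "@key=value" entry per line, emitted only when a value exists.
    extras = (
        ("tracestate", tracestate),
        ("id", id_),
        ("spanid", span_id),
        ("clientspanid", client_span_id),
    )
    for key, value in extras:
        if value:
            lines.append(f"@{key}={value}")
    return "\n".join(lines)
```

Because every extra field is optional and prefixed with `@`, a payload with no extras is just the bare `traceparent` value, which matches the pre-change column content described in this PR.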
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| test/DurableTask.SqlServer.Tests/Unit/SqlUtilsTraceContextTests.cs | Adds unit coverage for extended/legacy trace-context round-tripping and ClientSpanId persistence. |
| test/DurableTask.SqlServer.Tests/Integration/Orchestrations.cs | Adds integration regressions that validate exported Activity graphs have stable orchestration span IDs and no missing parents. |
| src/DurableTask.SqlServer/SqlUtils.cs | Implements the extended trace-context encoding/decoding and sub-orchestration ClientSpanId hydration. |
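The "no missing parents" property that the integration tests are described as validating can be sketched as a simple check over captured spans. This is illustrative Python under stated assumptions, not the test code in `Orchestrations.cs`: the `Span` type, the `find_orphans` helper, and the `external_parents` parameter are hypothetical names introduced here.

```python
from typing import Iterable, List, NamedTuple, Optional, Set


class Span(NamedTuple):
    span_id: str
    parent_span_id: Optional[str]  # None for a root span


def find_orphans(spans: Iterable[Span], external_parents: Set[str] = frozenset()) -> List[Span]:
    """Return spans whose parent ID was never emitted in the captured graph."""
    captured = list(spans)
    emitted = {s.span_id for s in captured}
    return [
        s
        for s in captured
        if s.parent_span_id
        and s.parent_span_id not in emitted
        # Assumption: parents created outside the capture (e.g. the caller's
        # own span) are passed in explicitly so they are not flagged.
        and s.parent_span_id not in external_parents
    ]
```

A regression test in this style would assert that `find_orphans(...)` returns an empty list for the spans captured during an orchestration run.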
…tions

- Reserve line 1 of every TraceContext payload for traceparent (empty for sub-orchestration-only payloads) so all callers share a single parsing contract. Avoids the risk of treating '@clientspanid=...' as a malformed W3C traceparent if a sub-orchestration row is ever read through the general DistributedTraceContext path.
- Centralize TraceContext parsing into a single ParseTraceContext helper. GetDistributedTraceContextFromReader and GetSubOrchestrationClientSpanId now share one parser, which extracts traceparent, tracestate, id, spanid, and clientspanid in one pass. Reduces drift over time.
- Backward compatibility: the parser still accepts the legacy single-line '@clientspanid=...' format that was written by earlier builds, so an upgrade does not break histories already in production databases. Added a dedicated regression test for the legacy payload shape.
- Tighten the sub-orchestration / activity client-span assertions in the TraceContextFlowCorrectly integration test to use Assert.Single rather than LastOrDefault. The test schedules exactly one of each, so any duplicate emission should now fail the regression deterministically.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
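The parsing contract in this commit message can be sketched as follows. This is illustrative Python, not the `ParseTraceContext` helper itself: the `ParsedTraceContext` type, the newline separator, and the choice to ignore unknown keys are assumptions. The reserved first line, the `@`-prefixed field names, and the legacy single-line `@clientspanid=...` shape come from the commit message and PR description.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParsedTraceContext:
    traceparent: Optional[str] = None
    tracestate: Optional[str] = None
    id: Optional[str] = None
    span_id: Optional[str] = None
    client_span_id: Optional[str] = None


# Maps "@key" names in the payload to attribute names on the result.
_FIELDS = {
    "tracestate": "tracestate",
    "id": "id",
    "spanid": "span_id",
    "clientspanid": "client_span_id",
}


def parse_trace_context(payload: Optional[str]) -> ParsedTraceContext:
    """Hypothetical single-pass parser shared by all callers."""
    result = ParsedTraceContext()
    if not payload:
        return result
    lines = payload.split("\n")
    if lines[0].startswith("@"):
        # Legacy shape: a single '@clientspanid=...' line with no reserved line 1.
        extras = lines
    else:
        # Current shape: line 1 is traceparent, possibly empty.
        result.traceparent = lines[0].strip() or None
        extras = lines[1:]
    for line in extras:
        line = line.strip()
        if not line.startswith("@") or "=" not in line:
            continue  # assumption: skip anything that is not '@key=value'
        key, value = line[1:].split("=", 1)
        attr = _FIELDS.get(key.lower())
        if attr and value:
            setattr(result, attr, value)
    return result
```

Splitting on the first `=` only keeps `tracestate` values such as `k=v` intact. Checking for a leading `@` on line 1 is what keeps a legacy sub-orchestration row from being treated as a malformed `traceparent`.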
Summary
- Preserve durable `DistributedTraceContext` identity in SQL history.
- Preserve sub-orchestration `ClientSpanId` so parent-side client spans can be recreated with the same span ID the child execution points to.

Why this is needed
`DurableTask.SqlServer` flattens history-event trace data into the `TraceContext` SQL column instead of serializing the full data contract graph. Before this change, that projection only preserved W3C `traceparent` and optional `tracestate`.

That was not enough for the Activity-based tracing contract in `DurableTask.Core`:
- `DistributedTraceContext.Id` and `DistributedTraceContext.SpanId` are used to restore the same orchestration execution span across reloads/replays/continuations.
- `SubOrchestrationInstanceCreatedEvent.ClientSpanId` is used to recreate the parent-side client span for a sub-orchestration with the exact span ID that the child orchestration server span references as its parent.

When SQL dropped those values:
- Child spans could reference a `parentSpanId` that the parent side never actually emitted, which shows up as a missing parent/orphaned span in trace viewers.

What changed
- Extend the SQL `TraceContext` encoding to preserve extra durable trace fields without a schema change:
  - `@tracestate=`
  - `@id=`
  - `@spanid=`
  - `@clientspanid=`
- Rehydrate `ClientSpanId` when materializing `SubOrchestrationInstanceCreatedEvent`.
- Add regression coverage for:
  - `DistributedTraceContext` round-tripping
  - `ClientSpanId` round-tripping
  - no `ParentSpanId` pointing to a missing span in the captured Activity graph

Scope note
This change fixes newly persisted SQL histories. It does not retroactively backfill older in-flight histories that were already stored in the legacy trace-context format before the provider change.
Validation
Sample screenshot illustrating the problem: spans with incorrect parent IDs.